Skip to content

🧪 Add unit tests for DiscoveredFile.from_chunk()#222

Open
bashandbone wants to merge 7 commits intomainfrom
test-discovery-from-chunk-15744345796435408751
Open

🧪 Add unit tests for DiscoveredFile.from_chunk()#222
bashandbone wants to merge 7 commits intomainfrom
test-discovery-from-chunk-15744345796435408751

Conversation

@bashandbone
Copy link
Contributor

@bashandbone bashandbone commented Mar 16, 2026

🎯 What: The testing gap addressed
Tests were missing for the conditional branches inside DiscoveredFile.from_chunk(). The logic creates a DiscoveredFile from a CodeChunk if its file_path is a valid, existing file, otherwise it raises a ValueError.

📊 Coverage: What scenarios are now tested

  1. Valid file: file_path points to a real file on disk.
  2. Invalid None: file_path is None.
  3. Invalid missing file: file_path points to a non-existent path.
  4. Invalid directory: file_path points to an existing directory instead of a file.

Result: The improvement in test coverage
src/codeweaver/core/discovery.py now has unit tests specifically checking the initialization and error raising logic for from_chunk, securing against future regressions.


PR created automatically by Jules for task 15744345796435408751 started by @bashandbone

Summary by Sourcery

Add unit tests for DiscoveredFile.from_chunk to validate behavior for valid and invalid CodeChunk file paths.

Tests:

  • Add tests covering successful DiscoveredFile creation from a CodeChunk with an existing file path.
  • Add tests asserting ValueError is raised when CodeChunk.file_path is None, points to a non-existent file, or points to a directory.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings March 16, 2026 02:45
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Mar 16, 2026

Reviewer's Guide

Adds focused unit tests for DiscoveredFile.from_chunk(), covering successful creation from a valid CodeChunk file and all error branches for invalid file_path values (None, non-existent file, and directory).

File-Level Changes

Change Details Files
Add unit tests that exercise all branches of DiscoveredFile.from_chunk() when building from a CodeChunk.
  • Introduce test_from_chunk_valid_file to verify successful DiscoveredFile creation when file_path points to a real file on disk.
  • Introduce test_from_chunk_invalid_file to assert ValueError is raised for None file_path, non-existent file_path, and directory paths.
  • Use pytest tmp_path fixture, Span, CodeChunk, and uuid7 to construct realistic CodeChunk instances for the tests.
tests/unit/core/test_discovery.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • Consider parametrizing the invalid file_path cases instead of grouping them into a single test_from_chunk_invalid_file to make failures more targeted and the test intent clearer.
  • If source_id does not need to be unique per test, you could replace uuid7() with a deterministic value to reduce dependency on external utilities and make the tests slightly simpler.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider parametrizing the invalid `file_path` cases instead of grouping them into a single `test_from_chunk_invalid_file` to make failures more targeted and the test intent clearer.
- If `source_id` does not need to be unique per test, you could replace `uuid7()` with a deterministic value to reduce dependency on external utilities and make the tests slightly simpler.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds unit tests to cover the success and failure branches of DiscoveredFile.from_chunk() to prevent regressions in file path validation.

Changes:

  • Added a test for creating a DiscoveredFile from a chunk pointing to a real file.
  • Added tests asserting ValueError for None, non-existent paths, and directory paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

from codeweaver.core.spans import Span
from codeweaver.core.discovery import DiscoveredFile
from codeweaver.core.utils import uuid7

Comment on lines +25 to +49
def test_from_chunk_invalid_file(tmp_path: Path):
"""Test that creating a DiscoveredFile from a CodeChunk fails when the file_path is invalid."""
# Condition 1: file_path is None
chunk_no_path = CodeChunk(
content="print('hello')",
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=None
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
DiscoveredFile.from_chunk(chunk_no_path)

# Condition 2: file_path points to a non-existent file
chunk_bad_path = CodeChunk(
content="print('hello')",
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=tmp_path / "does_not_exist.py"
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
DiscoveredFile.from_chunk(chunk_bad_path)

# Condition 3: file_path points to an existing directory instead of a file
chunk_dir = CodeChunk(
content="print('hello')",
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=tmp_path
google-labs-jules bot and others added 2 commits March 16, 2026 03:08
…b Action syntax

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 16, 2026 03:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds unit tests to cover DiscoveredFile.from_chunk() success and failure branches, and includes a few unrelated robustness tweaks in configuration/provider client modules.

Changes:

  • Added unit tests for DiscoveredFile.from_chunk() covering valid path and multiple invalid path scenarios.
  • Made fastembed type import more defensive to avoid runtime import failures.
  • Hardened CLI config display against missing settings keys; updated Claude workflow allow-list configuration.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/unit/core/test_discovery.py Adds unit tests covering DiscoveredFile.from_chunk() valid/invalid file_path branches.
src/codeweaver/providers/config/clients/multi.py Wraps fastembed.common.types import in try/except with fallback type.
src/codeweaver/cli/commands/config.py Uses safer .get() access for optional config fields in CLI display.
.github/workflows/claude.yml Expands allowed_non_write_users to include claude[bot].
test_fix.patch Patch artifact checked into repo (should likely not be committed).
patch_config.patch Patch artifact checked into repo (should likely not be committed).
patch_config2.patch Patch artifact checked into repo (should likely not be committed).
src/codeweaver/providers/config/clients/multi.py.orig Backup/original file checked into repo (should likely not be committed).
src/codeweaver/cli/commands/config.py.orig Backup/original file checked into repo (should likely not be committed).
src/codeweaver/cli/commands/config.py.rej Reject file checked into repo (should likely not be committed).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1 to +11
--- src/codeweaver/cli/commands/config.py
+++ src/codeweaver/cli/commands/config.py
@@ -103,7 +103,7 @@
table.add_row("Project Path", str(settings["project_path"]))
table.add_row("Project Name", settings["project_name"] or "auto-detected")
table.add_row("Token Limit", str(settings["token_limit"]))
- table.add_row("Max File Size", f"{settings['max_file_size']:,} bytes")
+ table.add_row("Max File Size", f"{settings.get('max_file_size', 0):,} bytes")
table.add_row("Max Results", str(settings["max_results"]))

# Feature flags
Comment on lines +25 to +52
def test_from_chunk_invalid_file(tmp_path: Path):
"""Test that creating a DiscoveredFile from a CodeChunk fails when the file_path is invalid."""
# Condition 1: file_path is None
chunk_no_path = CodeChunk(
content="print('hello')",
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=None
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
DiscoveredFile.from_chunk(chunk_no_path)

# Condition 2: file_path points to a non-existent file
chunk_bad_path = CodeChunk(
content="print('hello')",
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=tmp_path / "does_not_exist.py"
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
DiscoveredFile.from_chunk(chunk_bad_path)

# Condition 3: file_path points to an existing directory instead of a file
chunk_dir = CodeChunk(
content="print('hello')",
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=tmp_path
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
DiscoveredFile.from_chunk(chunk_dir)
Comment on lines +55 to 57
OnnxProvider = object
else:
OnnxProvider = object
Comment on lines +111 to +115
if settings.get("indexer") and settings["indexer"].get("only_index_on_command")
and not isinstance(settings.get("indexer", {}).get("only_index_on_command"), Unset)
else "✅",
)
table.add_row("Telemetry", "❌" if settings["telemetry"].get("disable_telemetry") else "✅")
table.add_row("Telemetry", "❌" if settings.get("telemetry") and settings["telemetry"].get("disable_telemetry") else "✅")
google-labs-jules bot and others added 2 commits March 16, 2026 04:24
…ck and missing config keys on dict views

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
…ck and missing config keys on dict views

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 16, 2026 04:31
…ck and missing config keys on dict views

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds unit test coverage for DiscoveredFile.from_chunk() while also including several unrelated housekeeping and workflow/config changes.

Changes:

  • Add unit tests covering valid and invalid CodeChunk.file_path scenarios for DiscoveredFile.from_chunk().
  • Make CLI config display more defensive when optional settings keys are missing.
  • Adjust provider config typing fallback and broaden Claude workflow trigger permissions.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tests/unit/core/test_discovery.py Adds new unit tests for DiscoveredFile.from_chunk() branches.
src/codeweaver/cli/commands/config.py Makes config rendering tolerate missing keys (e.g., max_file_size, indexer, telemetry).
src/codeweaver/providers/config/clients/multi.py Guards fastembed type import with try/except ImportError.
.github/workflows/claude.yml Broadens who can trigger Claude action and adds bot/user allowlists.
test_fix.patch Patch artifact committed to repo (appears unintended).
patch_config.patch / patch_config2.patch / patch_claude.patch Patch artifacts committed to repo (appear unintended).
src/codeweaver/providers/config/clients/multi.py.orig Backup artifact committed to repo (appears unintended).
src/codeweaver/cli/commands/config.py.orig / config.py.rej Backup/reject artifacts committed to repo (appear unintended).
.github/workflows/claude.yml.orig Backup artifact committed to repo (appears unintended).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
github_token: ${{ secrets.GITHUB_TOKEN }}
allowed_non_write_users: Copilot
allowed_non_write_users: "Copilot,claude[bot]"
Comment on lines +72 to 73
allowed_non_write_users: "Copilot,claude[bot]"
allowed_bots: "github-actions[bot],copilot[bot],dependabot[bot],copilot,github-actions,gemini[bot],claude[bot]"
Comment on lines +1 to +12
--- src/codeweaver/providers/config/clients/multi.py
+++ src/codeweaver/providers/config/clients/multi.py
@@ -50,6 +50,9 @@

if has_package("fastembed") is not None or has_package("fastembed_gpu") is not None:
- from fastembed.common.types import OnnxProvider
+ try:
+ from fastembed.common.types import OnnxProvider
+ except ImportError:
+ OnnxProvider = Any
else:
OnnxProvider = Any
Comment on lines +111 to +115
if settings.get("indexer") and settings["indexer"].get("only_index_on_command")
and not isinstance(settings.get("indexer", {}).get("only_index_on_command"), Unset)
else "✅",
)
table.add_row("Telemetry", "❌" if settings["telemetry"].get("disable_telemetry") else "✅")
table.add_row("Telemetry", "❌" if settings.get("telemetry") and settings["telemetry"].get("disable_telemetry") else "✅")
Comment on lines +20 to +23
discovered_file = DiscoveredFile.from_chunk(chunk)

assert isinstance(discovered_file, DiscoveredFile)
assert discovered_file.path.name == "valid_file.py"
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=None
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=tmp_path / "does_not_exist.py"
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
line_range=Span(start=1, end=1, source_id=uuid7()),
file_path=tmp_path
)
with pytest.raises(ValueError, match="CodeChunk must have a valid file_path"):
…ncy check and missing config keys on dict views

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants